Analyzing risk behaviors of youth

Introduction-

Risky behaviors are acts that increase the risk of disease or injury and can eventually threaten health or even life. The risky behaviors that young people engage in are of particular concern because they affect their well-being and life prospects (Gruber, 2001). Activities such as smoking, drinking alcohol, having sex, and taking drugs can have consequences that last for the rest of their lives. Our group therefore wants to better understand youth risk behavior patterns and draw insights that can help teenagers build lifelong healthy habits. Our project builds a surveillance system that analyzes three categories of health risk behavior among youth:

We obtained a data set from Kaggle that contains 2,740,200 rows and 35 columns. It comes from the Youth Risk Behavior Surveillance System (YRBSS), which surveys high school students in over 100 schools in the United States about adverse health behaviors. The survey data span 1991 to 2017 and associate a risk percentage with specific health-related issues across demographic categories such as race, grade, sex, and location.
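To make the shape of the data concrete, the sketch below shows one way a risk percentage could be averaged across demographic categories with pandas. The column names (year, sex, risk_pct) and the four sample rows are made up for illustration; they are not the actual Kaggle schema or real survey values.

```python
import pandas as pd

# Made-up sample rows; column names are hypothetical, not the Kaggle schema.
sample = pd.DataFrame({
    "year": [1991, 1991, 2017, 2017],
    "sex": ["Female", "Male", "Female", "Male"],
    "risk_pct": [30.0, 34.0, 8.0, 10.0],
})


def mean_risk_by(df, keys):
    """Average the risk percentage within each combination of the given columns."""
    return df.groupby(keys, as_index=False)["risk_pct"].mean()


# One row per survey year, averaged over all other categories in the sample.
by_year = mean_risk_by(sample, ["year"])

# One row per (year, sex) pair.
by_year_sex = mean_risk_by(sample, ["year", "sex"])

print(by_year)
```

The same helper could be called with race, grade, or location columns, assuming the real data set exposes them under some column name.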

Based on the data set, we came up with several questions that we would like to explore:

By analyzing the data set, we can identify relationships between behaviors and experiences among high school students and trace how risky behaviors change over time and across locations. Based on these findings, we can offer insights and suggestions on how to strengthen current law enforcement and improve legal regulations so that fewer high school students engage in risky behaviors. Our system can also guide non-profit organizations that help and protect at-risk young adults and teenagers in the United States.

Choice for Heavier Grading on Data Processing-

Our group decided that our project should be graded more heavily on data processing. We believe our work goes beyond the basic data processing that most data sets need because our data set stores three topics in separate sheets, and we put considerable effort into cleaning and processing each sheet before merging them into one. We spent time understanding this large data set before dropping any irrelevant information. We then recoded values, reindexed rows, and changed data types to further process the data in each sheet.
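The per-sheet steps described above might look roughly like the following pandas sketch. It is a minimal illustration under stated assumptions, not our exact pipeline: the sheet contents, the column names (year, risk_pct, footnote), the topic labels, and the choice to combine sheets by stacking them with pd.concat are all hypothetical.

```python
import pandas as pd


def clean_sheet(df, keep, topic):
    """Drop unneeded columns, fix data types, and reindex one topic sheet."""
    out = df[keep].copy()                       # drop irrelevant columns
    out["year"] = out["year"].astype(int)       # text -> integer year
    # Non-numeric entries become NaN and are removed below.
    out["risk_pct"] = pd.to_numeric(out["risk_pct"], errors="coerce")
    out = out.dropna(subset=["risk_pct"])
    out["topic"] = topic                        # remember the source sheet
    return out.reset_index(drop=True)           # rebuild a clean 0..n-1 index


# Two made-up sheets standing in for the real topic sheets.
sheet_a = pd.DataFrame({
    "year": ["1991", "2017"],
    "risk_pct": ["40.0", "n/a"],
    "footnote": ["", "*"],
})
sheet_b = pd.DataFrame({
    "year": ["1991", "2017"],
    "risk_pct": ["20.0", "10.0"],
    "footnote": ["", ""],
})

keep = ["year", "risk_pct"]
merged = pd.concat(
    [clean_sheet(sheet_a, keep, "topic_a"), clean_sheet(sheet_b, keep, "topic_b")],
    ignore_index=True,
)
print(merged)
```

Cleaning each sheet with the same function before combining keeps the column names and data types consistent, so the merged table does not need a second round of type fixes.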